[Ride Home] Simon Willison: Things we learned about LLMs in 2024
Description
Due to overwhelming demand (>15x applications:slots), we are closing CFPs for AI Engineer Summit NYC today. Last call! Thanks, we’ll be reaching out to all shortly!
The world’s top AI blogger and friend of every pod, Simon Willison, dropped a monster 2024 recap: Things we learned about LLMs in 2024. Brian of the excellent TechMeme Ride Home pinged us for a connection and a special crossover episode, our first in 2025.
The target audience for this podcast is tech-literate but non-technical. You can see Simon’s notes for AI Engineers in his World’s Fair keynote.
Timestamps
* 00:00 Introduction and Guest Welcome
* 01:06 State of AI in 2025
* 01:43 Advancements in AI Models
* 03:59 Cost Efficiency in AI
* 06:16 Challenges and Competition in AI
* 17:15 AI Agents and Their Limitations
* 26:12 Multimodal AI and Future Prospects
* 35:29 Exploring Video Avatar Companies
* 36:24 AI Influencers and Their Future
* 37:12 Simplifying Content Creation with AI
* 38:30 The Importance of Credibility in AI
* 41:36 The Future of LLM User Interfaces
* 48:58 Local LLMs: A Growing Interest
* 01:07:22 AI Wearables: The Next Big Thing
* 01:10:16 Wrapping Up and Final Thoughts
Transcript
[00:00:00] Introduction and Guest Welcome
[00:00:00] Brian: Welcome to the first bonus episode of the TechMeme Ride Home for the year 2025. I'm your host as always, Brian McCullough. Listeners to the pod over the last year know that I have made a habit of quoting from Simon Willison's blog when new stuff happens in AI. Simon has become a go-to for many folks in terms of analyzing things, criticizing things in the AI space.
[00:00:33] Brian: I've wanted to talk to you for a long time, Simon. So thank you for coming on the show.
[00:00:33] Simon: No, it's a privilege to be here.
[00:00:33] Brian: And the person that made this connection happen is our friend Swyx, who has been on the show before, even going back to the Twitter Spaces days, but who is also an AI guru in their own right. Swyx, thanks for coming on the show also.
[00:00:54] swyx: Thanks. I'm happy to be on and have been a regular listener, so just happy to [00:01:00] contribute as well.
[00:01:00] Brian: And a good friend of the pod, as they say. Alright, let's go right into it.
[00:01:06] State of AI in 2025
[00:01:06] Brian: Simon, I'm going to do the most unfair, broad question first, so let's get it out of the way. The year 2025. Broadly, what is the state of AI as we begin this year?
[00:01:20] Brian: Whatever you want to say, I don't want to lead the witness.
[00:01:22] Simon: Wow. So many things, right? I mean, the big thing is everything's got really good and fast and cheap. Like, that was the trend throughout all of 2024. The good models got so much cheaper, they got so much faster, they got multimodal, right? The image stuff isn't even a surprise anymore.
[00:01:39] Simon: They're doing video, all of that kind of stuff. So that's all really exciting.
[00:01:43] Advancements in AI Models
[00:01:43] Simon: At the same time, they didn't get massively better than GPT-4, which was a bit of a surprise. So that's sort of one of the open questions: are we going to see huge leaps again? But I kind of feel like that's a bit of a distraction, because GPT-4, but way cheaper, with much larger context lengths, and it [00:02:00] can do multimodal,
[00:02:01] Simon: is better, right? That's a better model, even if it's not smarter.
[00:02:05] Brian: What people were expecting, or hoping, maybe expecting is not the right word, but hoping, was that we would see another step change, right? From, like, GPT-2 to 3 to 4, we were expecting or hoping that maybe we were going to see the next evolution along those lines, yeah.
[00:02:21] Simon: We did see that, but not in the way we expected. We thought the model was just going to get smarter, and instead we got massive drops in price. We got all of these new capabilities. You can talk to the things now, right? They can do simulated audio input, all of that kind of stuff. And so it's kind of, it's interesting to me that the models improved in all of these ways we weren't necessarily expecting.
[00:02:43] Simon: I didn't know that, by the end of 2024, I'd be able to have it do an impersonation of Santa Claus, or, you know, talk to it through my phone and show it what I was seeing. But yeah, we didn't get that GPT-5 step. And that's one of the big open questions: is that actually just around the corner, and we'll have a bunch of GPT-5 class models drop in the [00:03:00] next few months?
[00:03:00] Simon: Or is there a limit?
[00:03:03] Brian: If you were a betting man and wanted to put money on it, do you expect to see a phase change, a step change, in 2025?
[00:03:11] Simon: Not particularly, not for that, like, the models just getting smarter. I think all of the trends we're seeing right now are going to keep on going, especially the inference-time compute, right?
[00:03:21] Simon: The trick that o1 and o3 are doing, which means that you can solve harder problems, but they cost more and it churns away for longer. I think that's going to happen because that's already proven to work. I don't know, maybe there will be a step change to a GPT-5 level, but honestly, I'd be completely happy if we got what we've got right now,
[00:03:41] Simon: but cheaper and faster, with more capabilities and longer contexts and so forth. That would be thrilling to me.
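[Note: for readers who want to see what inference-time compute looks like in practice, here is a minimal sketch using the OpenAI Python SDK. The model name and usage fields reflect the public API as of early 2025 and may well change, so treat this as illustrative rather than canonical.]

```python
# Minimal sketch: calling an o1-style reasoning model and inspecting
# how many hidden "reasoning" tokens it spent before answering.
# Assumes the openai package is installed and OPENAI_API_KEY is set;
# "o1-mini" is an illustrative model name and pricing changes often.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="o1-mini",
    messages=[{
        "role": "user",
        "content": "A bat and a ball cost $1.10 total. The bat costs "
                   "$1.00 more than the ball. What does the ball cost?",
    }],
)

print(response.choices[0].message.content)

# The extra inference-time compute shows up as billed reasoning tokens,
# which is why harder problems cost more and churn away for longer.
details = response.usage.completion_tokens_details
print("reasoning tokens:", details.reasoning_tokens)
```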
[00:03:46] Brian: Digging into what you've just said, by the way, I hope to link in the show notes to Simon's year-end post about the things we learned about LLMs in 2024. Look for that in the show notes.
[00:03:59] Cost Efficiency in AI
[00:03:59] Brian: One of the things that you [00:04:00] did say, that you alluded to even right there, was that in the last year you felt like the GPT-4 barrier was broken, i.e., other models, even open source ones, are now regularly matching sort of the state of the art.
[00:04:13] Simon: Well, it's interesting, right? So the GPT-4 barrier: a year ago, the best available model was OpenAI's GPT-4, and nobody else had even come close to it.
[00:04:22] Simon: And they'd been in the lead for like nine months, right? That thing came out in, what, February, March of 2023. And for the rest of 2023, nobody else came close. And so at the start of last year, like a year ago, the big question was: why has nobody beaten them yet? Like, what do they know that the rest of the industry doesn't know?
[00:04:40] Simon: And today, I've counted 18 organizations other than OpenAI who've put out a model which clearly beats that GPT-4 from a year ago. Like, maybe they're not better than GPT-4o, but that barrier got completely smashed. And yeah, a few of those I've run on my laptop, which is wild to me.
[00:04:59] Simon: Like, [00:05:00] it was very, very wild. It felt very clear to me a year ago that if you want GPT-4, you need a rack of $40,000 GPUs just to run the thing. And that turned out not to be true. This is that big trend from last year: the models getting more efficient, cheaper to run, just as capable with smaller weights and so forth.
[00:05:20] Simon: And I ran another GPT-4 class model on my laptop this morning, right? Microsoft's Phi-4 just came out. And if you look at the benchmarks, it's definitely up there with GPT-4o. It's probably not as good when you actually get into the vibes of the thing, but it's a 14 gigabyte download and I can run it on a MacBook Pro.
[00:05:38] Simon: Like, who saw that coming? The most exciting close of the year, on Christmas Day, just a few weeks ago, was when DeepSeek dropped their DeepSeek v3 model on Hugging Face without even a readme file. It was just like a giant binary blob that I can't run on my laptop. It's too big. But in all of the benchmarks, it's now by far the best available [00:06:00] open weights model.
[00:06:01] Simon: Like, it's beating the Meta Llamas and so forth. And that was trained for five and a half million dollars, which is a tenth of the price that people thought it cost to train these things. So everything's trending smaller and faster and more efficient.
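[Note: if you want to reproduce the local-model experiment Simon describes, one easy route is Ollama, which packages models like Phi-4 as a single download. This sketch assumes Ollama is installed and running, the weights have been fetched with `ollama pull phi4`, and the `ollama` Python client is installed; these details are illustrative, not from the episode.]

```python
# Minimal sketch: chatting with a GPT-4-class model running entirely
# on your own laptop via a local Ollama server. Once the ~14 GB of
# phi4 weights are pulled, no API key or network access is needed.
import ollama

response = ollama.chat(
    model="phi4",
    messages=[{"role": "user", "content": "What changed for LLMs in 2024?"}],
)
print(response["message"]["content"])
```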
[00:06:15] Brian: Well, okay.
[00:06:16] Challenges and Competition in AI
[00:06:16] Brian: I kind of was going to get to that later, but let's combine this with what I was going to ask you next, which is, you know, you're also talking in the piece about the LLM prices crashing, which I've even seen in projects that I'm working on. But explain that to a general audience, because we hear all the time that LLMs are eye-wateringly expensive to run, but what we're suggesting, and we'll come back to the cheap Chinese models.
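[Note: to put rough numbers on the price crash, here is a back-of-the-envelope comparison in Python. The figures are public list prices as of late 2024 and change frequently, so re-check them before relying on the math.]

```python
# Back-of-the-envelope on the LLM price crash, using published list
# prices (late 2024, subject to change):
#   GPT-4 at launch (March 2023): ~$30.00 per million input tokens
#   GPT-4o mini (2024):            ~$0.15 per million input tokens
tokens = 2_500_000  # a hypothetical 2.5-million-token corpus

gpt4_cost = tokens / 1_000_000 * 30.00
mini_cost = tokens / 1_000_000 * 0.15

print(f"GPT-4 (2023): ${gpt4_cost:.2f}")         # $75.00
print(f"GPT-4o mini:  ${mini_cost:.2f}")         # $0.38
print(f"~{gpt4_cost / mini_cost:.0f}x cheaper")  # ~200x
```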